Feature Selection for a Rich HPSG Grammar Using Decision Trees

Authors

  • Kristina Toutanova
  • Christopher D. Manning
Abstract

This paper examines feature selection for log-linear models over rich constraint-based grammar (HPSG) representations by building decision trees over features in corresponding probabilistic context-free grammars (PCFGs). We show that single decision trees do not make optimal use of the available information; ensembles of decision trees constructed over different feature subspaces show significant performance gains (14% parse selection error reduction). We compare the performance of the learned PCFG grammars and log-linear models over the same features.
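As a rough illustration of the two ideas named in the abstract, the sketch below scores the candidate parses of one sentence with a conditional log-linear model and then averages the distributions of several such models, each restricted to a different feature subspace. It is not the authors' implementation: the feature names, the weights, and the choice of simple averaging as the combination rule are all assumptions made for the example, and the paper's ensemble members are decision trees rather than the weight dictionaries used here.

```python
import math

def parse_probabilities(candidates, weights):
    """Conditional log-linear model over the candidate parses of one sentence.

    candidates: list of {feature_name: value} dicts, one per parse.
    weights:    {feature_name: weight}; features absent here are ignored,
                which is how a model is restricted to a feature subspace.
    """
    scores = [
        sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for feats in candidates
    ]
    top = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_probabilities(candidates, weight_sets):
    """Average the distributions of several models (assumed combination rule)."""
    dists = [parse_probabilities(candidates, w) for w in weight_sets]
    return [
        sum(d[i] for d in dists) / len(dists)
        for i in range(len(candidates))
    ]

# Hypothetical features for two candidate parses of one sentence.
candidates = [
    {"rule:head-subj": 1, "rule:head-comp": 1},
    {"rule:head-adj": 1},
]
# Two hypothetical models, each looking at a different feature subspace.
weight_sets = [
    {"rule:head-subj": 1.0},
    {"rule:head-adj": 0.5},
]
probs = ensemble_probabilities(candidates, weight_sets)
best = max(range(len(candidates)), key=lambda i: probs[i])
```

Parse selection then amounts to returning the candidate at index `best`. Averaging is only one way to combine ensemble members; other combination rules would fit the same interface.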

Similar resources

Grammar conversion from LTAG to HPSG

We propose an algorithm for converting an arbitrary FB-LTAG grammar into a strongly equivalent HPSG-style grammar. Our algorithm converts LTAG elementary trees into HPSG feature structures by encoding the tree structures in stacks. A set of pre-determined rules manipulates the stack to emulate substitution and adjunction. We have used our algorithm to obtain HPSG-style gramm...

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

With the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to improve the accuracy of credit card fraud detection. After selecting the best and most effec...

Parse disambiguation for a rich HPSG grammar

The fine-grained nature of the HPSG representations found in the Redwoods treebank raises novel issues in parse disambiguation relative to more traditional treebanks such as the Penn treebank, which have been the focus of most past work on probabilistic parsing (e.g., Charniak 1997; Collins 1997). The Redwoods treebank is much richer in the representations it makes available. Most similar to Pe...

Algebraic Methods in Language Processing AMiLP-3,

Head-driven Phrase Structure Grammar (HPSG, Pollard and Sag (1987, 1994)) is currently one of the most prominent linguistic theories. A grammar for HPSG is given by a set of abstract language-universal principles, a set of language-specific principles, and a lexicon. A sentence is grammatical if it is compatible with all of the principles. The data structures underlying HPSG are so-called feat...

Using an HPSG grammar for the generation of prosodic structures

In this paper, we report on an experiment showing how the introduction of prosodic information from detailed syntactic structures into synthetic speech leads to better disambiguation of structurally ambiguous sentences. Using modifier attachment (MA) ambiguities and subject/object fronting (OF) in German as test cases, we show that prosody which is automatically generated from deep syntactic in...

Publication date: 2002